N=1:

rank | frequency | n-gram |
---|---|---|
1 | 4944 | -ه |
2 | 3929 | -ن |
3 | 3053 | -ی |
4 | 2135 | -، |
5 | 2107 | -ˇ |

N=2:

rank | frequency | n-gram |
---|---|---|
1 | 1572 | -ان |
2 | 912 | -نه |
3 | 765 | -نˇ |
4 | 598 | -ته |
5 | 546 | -ده |

N=3:

rank | frequency | n-gram |
---|---|---|
1 | 307 | -انه |
2 | 255 | -انˇ |
3 | 208 | -ران |
4 | 202 | -انی |
5 | 201 | -ؤنˇ |

N=4:

rank | frequency | n-gram |
---|---|---|
1 | 135 | -نوار |
2 | 81 | -ستان |
3 | 70 | -اران |
4 | 57 | -انان |
5 | 44 | -ترین |

N=5:

rank | frequency | n-gram |
---|---|---|
1 | 133 | -انوار |
2 | 23 | -ستانˇ |
3 | 22 | -رستان |
4 | 20 | -داران |
5 | 18 | -هٰنˇ |

The tables show the most frequent letter n-grams at the end of words for N=1…5. The procedure parallels 2.2.5 Most frequent word beginnings, except that the aim here is suffix detection rather than prefix detection.
For N=3:
SELECT @pos := (@pos + 1), xx.* FROM (SELECT @pos := 0) r, (SELECT COUNT(*) AS cnt, CONCAT('-', RIGHT(word, 3)) AS ngram FROM words WHERE w_id > 100 GROUP BY RIGHT(word, 3) ORDER BY cnt DESC) xx LIMIT 5;
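
The same count can be sketched outside the database for any N. This is a minimal illustration rather than the method used to produce the tables above. It assumes the word list is already loaded into memory, and the function name `top_endings` and the sample words are hypothetical. Like `RIGHT(word, N)`, the slice `w[-n:]` returns the whole word when it is shorter than N. Results may still differ from the SQL query where the database collation treats some characters as equal.

```python
from collections import Counter


def top_endings(words, n, k=5):
    """Return the k most frequent word-final letter n-grams.

    Each row is (rank, frequency, n-gram), with the n-gram prefixed by "-"
    to mark it as a word ending, as in the tables above.
    """
    # Count the last n characters of every non-empty word.
    counts = Counter(w[-n:] for w in words if w)
    rows = []
    for rank, (ending, freq) in enumerate(counts.most_common(k), start=1):
        rows.append((rank, freq, "-" + ending))
    return rows


if __name__ == "__main__":
    # Hypothetical sample data, not taken from the corpus.
    sample = ["baran", "tehran", "iran", "man", "zan", "ab"]
    for row in top_endings(sample, n=2):
        print(" | ".join(str(cell) for cell in row))
```

Calling `top_endings(words, n)` once for each n from 1 to 5 gives one list of rows per table.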
2.2.5 Most frequent word beginnings